Smart Paper Notes

VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids

Katja Schwarz , Axel Sauer , Michael Niemeyer , Yiyi Liao , Andreas Geiger

Category: Computer Vision

2022-06-15

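State-of-the-art 3D-aware generative models rely on coordinate-based MLPs to parameterize 3D radiance fields. While these models demonstrate impressive results, querying an MLP for every sample along each ray leads to slow rendering. Existing approaches therefore often render low-resolution feature maps and process them with an upsampling network to obtain the final image. Although efficient, neural rendering often entangles viewpoint and content, so that changing the camera pose results in unwanted changes of geometry or appearance. Motivated by recent results in voxel-based novel view synthesis, we investigate the utility of sparse voxel grid representations for fast and 3D-consistent generative modeling. Our results demonstrate that monolithic MLPs can indeed be replaced by 3D convolutions when sparse voxel grids are combined with progressive growing, free-space pruning, and appropriate regularization. To obtain a compact representation of the scene and allow scaling to higher voxel resolutions, our model disentangles the foreground object (modeled in 3D) from the background (modeled in 2D). In contrast to existing approaches, our method requires only a single forward pass to generate a full 3D scene. It hence allows efficient rendering from arbitrary viewpoints while yielding 3D-consistent results with high visual fidelity.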
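
To make the per-sample cost concrete, the following is a minimal sketch of the standard volume-rendering compositing step used with radiance fields. It is not taken from the paper: the function name `composite_ray` and the toy inputs are hypothetical, and the densities and colors are assumed to have already been produced by some field (an MLP or a voxel grid).

```python
import math

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: non-negative density per sample (assumed given)
    colors:    (r, g, b) tuple per sample (assumed given)
    deltas:    distance covered by each sample along the ray
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        weight = transmittance * alpha           # contribution of this sample
        rgb = [acc + weight * c for acc, c in zip(rgb, color)]
        transmittance *= 1.0 - alpha
    return rgb, transmittance

# Toy ray: an empty first sample followed by a fully opaque green sample.
pixel, remaining = composite_ray(
    densities=[0.0, 1e9],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[1.0, 1.0],
)
```

Every pixel repeats this loop over all samples on its ray, which is why the cost of producing each density and color value dominates rendering time.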

On the Frequency Bias of Generative Models

Katja Schwarz , Yiyi Liao , Andreas Geiger

Category: Computer Vision

2021-11-03

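The key objective of Generative Adversarial Networks (GANs) is to generate new data with the same statistics as the provided training data. However, multiple recent works show that state-of-the-art architectures still struggle to achieve this goal. In particular, they report an elevated amount of high frequencies in the spectral statistics, which makes it straightforward to distinguish real from generated images. Explanations for this phenomenon are controversial: while most works attribute the artifacts to the generator, other works point to the discriminator. We take a sober look at these explanations and provide insights into what makes the proposed measures against high-frequency artifacts effective. To this end, we first independently assess the architectures of both the generator and the discriminator and investigate whether they exhibit a frequency bias that makes learning the distribution of high-frequency content particularly problematic. Based on these experiments, we make the following four observations: 1) Different upsampling operations bias the generator towards different spectral properties. 2) Checkerboard artifacts introduced by upsampling cannot alone explain the spectral discrepancies, as the generator is able to compensate for these artifacts. 3) The discriminator does not struggle with detecting high frequencies per se, but rather with frequencies of low magnitude. 4) The downsampling operations in the discriminator can impair the quality of the training signal it provides. In light of these findings, we analyze the measures proposed against high-frequency artifacts in state-of-the-art GAN training, but find that none of the existing approaches can fully resolve spectral artifacts yet. Our results suggest that there is great potential in improving the discriminator, and that this could be key to matching the distribution of the training data more closely.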
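
Observation 1 can be illustrated with a one-dimensional toy example. The sketch below is not from the paper: it uses a plain discrete Fourier transform on a constant signal, and the helper names are hypothetical. It compares two common upsampling schemes, zero insertion and nearest-neighbor repetition, by the magnitude they leave at the highest (Nyquist) frequency bin.

```python
import cmath

def dft_magnitudes(signal):
    """Magnitude of each bin of the discrete Fourier transform (naive O(n^2))."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]

def upsample_zero_insertion(signal):
    """Double the length by inserting a zero after every sample."""
    out = []
    for x in signal:
        out.extend([x, 0.0])
    return out

def upsample_nearest(signal):
    """Double the length by repeating every sample."""
    out = []
    for x in signal:
        out.extend([x, x])
    return out

flat = [1.0, 1.0, 1.0, 1.0]            # constant signal: only a DC component
zero_spec = dft_magnitudes(upsample_zero_insertion(flat))
near_spec = dft_magnitudes(upsample_nearest(flat))
nyquist = len(zero_spec) // 2           # index of the highest frequency bin
```

In this toy case zero insertion copies the DC energy to the Nyquist bin, while nearest-neighbor repetition leaves that bin empty, so the two schemes start from different spectra before any learned filter is applied.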

GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis

Katja Schwarz , Yiyi Liao , Michael Niemeyer , Andreas Geiger

Category:

2020-07-05

While 2D generative adversarial networks have enabled high-resolution image synthesis, they largely lack an understanding of the 3D world and the image formation process. Thus, they do not provide precise control over camera viewpoint or object pose. To address this problem, several recent approaches leverage intermediate voxel-based representations in combination with differentiable rendering. However, existing methods either produce low image resolution or fall short in disentangling camera and scene properties, e.g., the object identity may vary with the viewpoint. In this paper, we propose a generative model for radiance fields, which have recently proven successful for novel view synthesis of a single scene. In contrast to voxel-based representations, radiance fields are not confined to a coarse discretization of the 3D space, yet allow for disentangling camera and scene properties while degrading gracefully in the presence of reconstruction ambiguity. By introducing a multi-scale patch-based discriminator, we demonstrate synthesis of high-resolution images while training our model from unposed 2D images alone. We systematically analyze our approach on several challenging synthetic and real-world datasets. Our experiments reveal that radiance fields are a powerful representation for generative image synthesis, leading to 3D consistent models that render with high fidelity.

Efficient aggregation of face embeddings for decentralized face recognition deployments (extended version)

Philipp Hofer , Michael Roland , Philipp Schwarz , Renè Mayrhofer

Category: Artificial Intelligence | Computer Vision

2022-12-20

Biometrics are one of the most privacy-sensitive data. Ubiquitous authentication systems with a focus on privacy favor decentralized approaches as they reduce potential attack vectors, both on a technical and organizational level. The gold standard is to let the user be in control of where their own data is stored, which consequently leads to a high variety of devices used. Moreover, in comparison with a centralized system, designs with higher end-user freedom often incur additional network overhead. Therefore, when using face recognition for biometric authentication, an efficient way to compare faces is important in practical deployments, because it reduces both network and hardware requirements that are essential to encourage device diversity. This paper proposes an efficient way to aggregate embeddings used for face recognition based on an extensive analysis on different datasets and the use of different aggregation strategies. As part of this analysis, a new dataset has been collected, which is available for research purposes. Our proposed method supports the construction of massively scalable, decentralized face recognition systems with a focus on both privacy and long-term usability.
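
As a rough illustration of what aggregating embeddings can look like, the sketch below averages several L2-normalized embeddings of one person into a single template and compares a probe against it with cosine similarity. Averaging is only one possible strategy and is an assumption here, since the abstract does not say which strategy the paper recommends. The function names and the two-dimensional toy vectors are hypothetical, and zero-length vectors are not handled.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def aggregate(embeddings):
    """Average the normalized embeddings and renormalize the result."""
    normalized = [l2_normalize(e) for e in embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Two toy embeddings of the same person are collapsed into one template,
# so a probe needs one comparison instead of one per stored embedding.
template = aggregate([[1.0, 0.0], [0.0, 1.0]])
match_score = cosine_similarity(template, [1.0, 1.0])
mismatch_score = cosine_similarity(template, [1.0, -1.0])
```

Storing one template per person rather than every embedding is what would reduce the data sent over the network and the number of comparisons on the device.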

Understanding Text Classification Data and Models Using Aggregated Input Salience

Sebastian Ebert , Alice Shoshana Jakobovits , Katja Filippova

Category: Natural Language Processing

2022-11-10

Realizing when a model is right for a wrong reason is not trivial and requires a significant effort by model developers. In some cases, an input salience method, which highlights the most important parts of the input, may reveal problematic reasoning. But scrutinizing highlights over many data instances is tedious and often infeasible. Furthermore, analyzing examples in isolation does not reveal general patterns in the data or in the model's behavior. In this paper we aim to address these issues and go from understanding single examples to understanding entire datasets and models. The methodology we propose is based on aggregated salience maps. Using this methodology we address multiple distinct but common model developer needs by showing how problematic data and model behavior can be identified -- a necessary first step for improving the model.

Efficient first-order predictor-corrector multiple objective optimization for fair misinformation detection

Eric Enouen , Katja Mathesius , Sean Wang , Arielle Carr , Sihong Xie

Category: Machine Learning

2022-09-15

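Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning, such as minimizing classification loss together with the disparity in how different populations are treated, for the sake of fairness. At optimality, further optimizing one objective necessarily harms at least one other objective, and decision-makers need to comprehensively explore multiple optima (called the Pareto front) to pinpoint one final solution. We address the efficiency of finding the Pareto front. First, finding the front from scratch using stochastic multi-gradient descent (SMGD) is expensive with large neural networks and datasets. We propose to explore the Pareto front as a manifold starting from a few initial optima, based on a predictor-corrector method. Second, for each exploration step, the predictor solves a large-scale linear system that scales quadratically in the number of model parameters and requires one backpropagation to evaluate a second-order Hessian-vector product per iteration of the solver. We propose a Gauss-Newton approximation that scales only linearly and requires only first-order inner products per iteration. This also allows a choice between the MINRES and conjugate gradient methods when approximately solving the linear system. These innovations make predictor-corrector methods feasible for large networks. Experiments on multi-objective (fairness and accuracy) misinformation detection tasks show that 1) the predictor-corrector method can find Pareto fronts better than or similar to those found by SMGD in less time; and 2) the proposed first-order method does not harm the quality of the Pareto front identified by the second-order method, while further reducing running time.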
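
The notion of a Pareto front used above can be made concrete with a small brute-force filter. This sketch is not the predictor-corrector method from the paper; it only shows which candidate solutions count as Pareto-optimal when both objectives are to be minimized. The function names and the (classification loss, fairness disparity) values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (lower is better for all objectives)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (classification loss, fairness disparity) pairs for five models.
candidates = [
    (0.10, 0.30),
    (0.20, 0.10),
    (0.15, 0.35),   # worse than (0.10, 0.30) on both objectives
    (0.30, 0.05),
    (0.25, 0.12),   # worse than (0.20, 0.10) on both objectives
]
front = pareto_front(candidates)
```

Each point that survives the filter represents a different trade-off, and improving one of its objectives would require accepting a worse value on the other.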

How to Find Strong Summary Coherence Measures? A Toolbox and a Comparative Study for Summary Coherence Measure Evaluation

Julius Steen , Katja Markert

Category: Natural Language Processing

2022-09-14

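Automatically evaluating the coherence of summaries is of great significance, both to enable cost-efficient summarizer evaluation and as a tool for improving coherence by selecting high-scoring candidate summaries. While many different approaches have been suggested to model summary coherence, they are often evaluated using disparate datasets and metrics. This makes it difficult to understand their relative performance and to identify ways forward towards better summary coherence modelling. In this work, we conduct a large-scale investigation of various methods for summary coherence modelling on an even playing field. Additionally, we introduce two novel analysis measures, intra-system correlation and bias matrices, that help identify biases in coherence measures and provide robustness against system-level confounders. While none of the currently available automatic coherence measures is able to assign reliable coherence scores to system summaries across all evaluation metrics, large-scale language models fine-tuned on self-supervised tasks show promising results, as long as fine-tuning takes into account that they need to generalize across different summary lengths.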

MLExchange -- A web-based platform enabling exchangeable machine learning workflows

Zhuowen Zhao , Tanny Chavez , Elizabeth Holman , Guanhua Hao , Adam Green , Harinarayan Krishnan , Dylan McReynolds , Ronald Pandolfi , Eric J. Roberts , Petrus H. Zwart

Category: Machine Learning | Artificial Intelligence

2022-08-20

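Machine learning (ML) algorithms are increasingly helping scientific communities across different disciplines and institutions to address large and diverse data problems. However, many available ML tools are programmatically demanding and computationally costly. The MLExchange project aims to build a collaborative platform equipped with enabling tools that allow scientists and facility users without a profound ML background to use ML and computational resources in scientific discovery. At a high level, we are targeting a full user experience in which managing and exchanging ML algorithms, workflows, and data are readily available through web applications. So far, we have built four major components, namely the central job manager, the centralized content registry, the user portal, and the search engine, and have successfully deployed these components on a testing server. Since each component is an independent container, the whole platform or its individual services can easily be deployed on servers of different scales, ranging from a laptop (usually a single user) to high-performance clusters (HPC) accessed simultaneously by many users. MLExchange thus supports flexible usage scenarios: users can either access the services and resources from a remote server or run the whole platform or its individual services within their local network.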

Artifact Identification in X-ray Diffraction Data using Machine Learning Methods

Howard Yanxon , James Weng , Hannah Parraga , Wenqian Xu , Uta Ruett , Nicholas Schwarz

Category: Computer Vision | Machine Learning

2022-07-29

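The in situ synchrotron high-energy X-ray powder diffraction (XRD) technique is widely used by researchers to analyze the crystallographic structures of materials in functional devices (e.g., battery materials) or in complex sample environments (e.g., reactors). The atomic structure of a material can be identified from its diffraction pattern, together with detailed analysis such as Rietveld refinement, which indicates how the measured structure deviates from the ideal structure (e.g., through internal stresses or defects). For in situ experiments, a series of XRD images is usually collected on the same sample under different conditions (e.g., adiabatic conditions), yielding different states of matter, or simply collected continuously as a function of time to track the change of a sample over a chemical or physical process. In situ experiments are usually performed with area detectors, which collect 2D images composed of diffraction rings for ideal powders. Depending on the form of the material, a realistic sample and its environment may show features other than the typical Debye-Scherrer rings, such as textures or preferred orientations and single-crystal diffraction spots in the 2D XRD image. In this work, we present an investigation of machine learning methods for fast and reliable identification of single-crystal diffraction spots in XRD images. Excluding these artifacts during XRD image integration allows a precise analysis of the powder diffraction rings of interest. We observe that the gradient boosting method consistently produces high-accuracy results when trained on small subsets of highly diverse datasets. Compared with the conventional method, it dramatically reduces the time spent on identifying and separating single-crystal spots.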

Custom Pretrainings and Adapted 3D-ConvNeXt Architecture for COVID Detection and Severity Prediction

Daniel Kienzle , Julian Lorenz , Robin Schön , Katja Ludwig , Rainer Lienhart

Category: Computer Vision

2022-06-30

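Since COVID strongly affects the respiratory system, lung CT scans can be used to analyze a patient's health. We introduce a neural network for predicting the severity of lung damage and detecting infection from three-dimensional CT scans. To this end, we adapt the recent ConvNeXt model to process three-dimensional data. Furthermore, we introduce different pretraining methods specifically adjusted to improve the model's ability to handle three-dimensional CT data. To test the performance of our model, we participated in the 2nd COV19D Competition for severity prediction and infection detection.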
